Systems and methods for live media content matching

ABSTRACT

Systems and methods for matching media content are disclosed, including: at a server, obtaining first media content from a client device, wherein the first media content item corresponds to a first portion of media content being played on the client device; obtaining second media content from a content source distinct from the server; comparing the first media content and the second media content; based on a determination that the second media content corresponds to a portion of the media content that is earlier than the first media content: obtaining third media content from the content source corresponding to a third portion of the media content subsequent to the second media content; comparing the first media content with the third media content; and based on a determination that the first and third media content are concurrent, identifying the first media content using identification information corresponding to the third media content.

RELATED APPLICATIONS

This application is a continuation of U.S. patent application Ser. No. 15/603,357, titled “Systems and Methods for Live Media Content Matching,” filed May 23, 2017, which is a continuation of U.S. patent application Ser. No. 13/623,031, titled “Systems and Methods for Live Media Content Matching,” filed Sep. 19, 2012, now U.S. Pat. No. 9,661,361, issued May 23, 2017, which is hereby incorporated by reference herein in its entirety.

This application is related to U.S. patent application Ser. No. 13/536,779, titled “Generating a Sequence of Audio Fingerprints at a Set Top Box,” filed Jun. 28, 2012, now U.S. Pat. No. 9,113,203, which is hereby incorporated by reference herein in its entirety.

TECHNICAL FIELD

The disclosed implementations relate generally to matching live media content.

BACKGROUND

Determining what media content a user is consuming (e.g., watching/listening to) has become increasingly important. For example, an online merchant might be interested in finding out which TV episode or movie a user is watching so that targeted promotion materials can be sent to the user. For another example, a recording company might want to recommend music similar to an audio stream being played on a user's computer. Typically, the media content a user is consuming can be determined by first sampling the user media content, and then comparing it with known media content provided by external sources. Difficulties arise, however, when media content received from the external sources does not accurately reflect what is being consumed by the user, for example, when the media content received from the external sources is significantly delayed and, as a result, lags behind the media content being consumed by the user.

BRIEF DESCRIPTION OF THE DRAWINGS

The implementations disclosed herein are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings. Like reference numerals refer to corresponding parts throughout the drawings.

FIG. 1 is a block diagram illustrating a distributed client-server system, in accordance with some implementations.

FIG. 2 is a block diagram illustrating a client device, in accordance with some implementations.

FIG. 3 is a block diagram illustrating a server system, in accordance with some implementations.

FIG. 4 is a block diagram illustrating one or more content feeds, in accordance with some implementations.

FIG. 5 is a flow chart illustrating a method for matching live media content in a client-server system, in accordance with some implementations.

FIG. 6 is a flow chart illustrating a method for matching live media content at a server system, in accordance with some implementations.

DETAILED DESCRIPTION

The implementations described herein provide techniques for matching media content being displayed on a client device (e.g., a connected TV, a Google TV-enabled device) with known media content provided by content feeds, so as to determine or identify what media content is being displayed on the client device. These techniques are beneficial because they allow client devices, such as a connected TV or a Google TV-enabled device, to identify media content received via digital connections (e.g., HDMI), even when the media content is received without corresponding channel information.

First, in some implementations, a server receives, from a client device, first media content, which is a portion of media content being played on the client device (e.g., the second 10 seconds of a TV episode a user is watching on a Google TV). In some implementations, the first media content also has an expiration time (e.g., a 3-second window) to account for network delays between the client device and the server.

Next, in some implementations, the server receives, from one or more content feeds, second media content, which is also a portion of the media content being played on the client device (e.g., the first 10 seconds of the TV episode the user is watching on the Google TV). In some implementations, the server also receives information identifying the second media content (e.g., a title of the TV episode).

Then, in some implementations, the server determines whether the second media content has already been played on the client device. For example, the server determines whether the second media content represents a portion that is before or after the first media content. In some implementations, when the second media content has already been displayed on the client (e.g., the media content received from the content feeds lags behind the media content received from the client device), the server determines that the content feeds are slower in providing media content. In accordance with this determination, the server, after a certain amount of time (e.g., 50 milliseconds) passes (to allow the content feeds to catch up with the client device), receives third media content from the content feeds.

In some implementations, before the expiration time, the server receives the third media content, which is a portion subsequent to the second media content (e.g., the second 10 seconds of the TV episode the user is watching), and information identifying the third media content (e.g., the title of the TV episode). In some implementations, the server then compares the first media content with the third media content. When there is a match between the first and third media content, the server, using the information identifying the third media content, identifies what media content is being displayed on the client device.

In this way, even when the content feed is slower than the client device in providing the matching media content, the server is still able to ascertain what media content is being played on the client device, so that further actions can be taken. For example, after the server determines that a user is watching a Law and Order episode, another Law and Order episode, an episode of CSI (a program similar to Law and Order) and promotional materials may be provided to the user. Additional details of implementations are now described in relation to the Figures.
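The overview above can be illustrated with a minimal Python sketch. It is not the claimed implementation: it assumes, purely for illustration, that each portion carries a comparable playback position and a string fingerprint, and the names `FeedPortion`, `FakeClock`, and `identify_client_content` are hypothetical. The default retry delay of 0.05 seconds mirrors the 50 millisecond example in the text.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical representation of a feed portion:
# (playback position, fingerprint, identifying title).
FeedPortion = Tuple[int, str, str]


class FakeClock:
    """Deterministic clock so the sketch runs without real sleeping."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def wait(self, seconds: float) -> None:
        self.t += seconds


def identify_client_content(
    client_position: int,
    client_fingerprint: str,
    feed: Iterable[FeedPortion],
    expires_at: float,
    now: Callable[[], float],
    wait: Callable[[float], None],
    retry_delay: float = 0.05,
) -> Optional[str]:
    """Return the title of the concurrent feed portion, or None."""
    for position, fingerprint, title in feed:
        if now() > expires_at:
            # The client sample is no longer valid for comparison.
            return None
        if position == client_position and fingerprint == client_fingerprint:
            # Concurrent portions: reuse the feed's identifying information.
            return title
        if position >= client_position:
            # Same position but different content, or the feed is ahead of
            # the client; this sketch does not handle that branch.
            return None
        # The feed lags behind the client: pause so it can catch up,
        # then examine the next (subsequent) feed portion.
        wait(retry_delay)
    return None
```

With a feed of `[(0, "fp-a", "Episode 1"), (1, "fp-b", "Episode 1")]` and a client sample at position 1 with fingerprint `"fp-b"`, the sketch skips the lagging first portion, waits once, and returns the title attached to the second portion.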

FIG. 1 is a block diagram illustrating a distributed client-server system 100 for matching live media content. In some implementations, the distributed client-server system 100 includes a client device 102 (referred to herein as “client 102,” e.g., a connected TV or a Google TV), a broadcast system 104, a server system 106 (referred to herein as “server 106”), one or more content feeds 108, and one or more communication networks 110.

In some implementations, the client 102 receives media content 134 from the broadcast system 104 and/or the content feeds 108. In some implementations, the client 102 also displays a representation of the received media content 134. In some implementations, the client 102 optionally sends client data 136 and content fingerprint 132, or respective portion(s) thereof, to the broadcast system 104 and to the server 106 (via the one or more communication networks 110), respectively. In some implementations, the client 102 also sends media content 134 to the broadcast system 104 and to the server 106.

In some implementations, the client 102 includes a media content processing module 112, which processes (e.g., plays) the received media content 134 for a user to consume (e.g., watch and/or listen to). In some implementations, the media content processing module 112, optionally, generates one or more fingerprints of the media content being played on the client 102, and transmits the one or more content fingerprints 132 to the server 106. In some implementations, the media content processing module 112 includes a general purpose Internet browser (sometimes called a Web browser) having a plug-in (e.g., a Google TV plug-in) for playing the received media content 134. In some implementations, the media content processing module 112 includes a stand-alone media player.

In some implementations, the broadcast system 104 optionally includes a TV broadcast system, a radio broadcast system, a computer network, other types of networks, or a combination of these systems.

In some implementations, the server 106 includes a front end server 116, a comparison module 118, a content database 120, a fingerprint database 122, and, optionally, a fingerprint generation module 124. The front end server 116, in some implementations, receives respective media content 134 (or the content fingerprint 132), from the client 102 and the content feeds 108. The front end server 116, in some implementations, also sends comparison data 138 to the client 102, via the communication network(s) 110. In some implementations, the comparison module 118 compares media content and/or corresponding fingerprint(s) received from the client 102 with that received from the content feeds 108. In some implementations, the content database 120 stores media content received from the client 102 and from the content feeds 108. In some implementations, the fingerprint database 122 stores one or more fingerprints for media content received from the client 102 and from the content feeds 108. In some implementations, optionally, the fingerprint generation module 124 generates one or more fingerprints for the received media content 134, or portion(s) thereof.

In some implementations, the content feeds 108 include one or more content feeds that provide media streams, including media content being played on the client 102, or portion(s) thereof, to the server 106. In some implementations, the content feeds 108 include a video content feed 126, an audio content feed 128, and, optionally, one other content feed 130. In some implementations, the video content feed 126 provides, to the server 106, one or more video media streams (e.g., TV episodes, movies, and YouTube videos), including video media content being played on the client 102. In some implementations, the audio content feed 128 provides, to the server 106, one or more audio media streams (e.g., songs, podcasts, and audio books) or audio portions of video media streams, including audio media content being played on the client 102. In some implementations, the other content feed 130 provides, to the server 106, one or more other media streams, including media content being played on the client 102.

In some implementations, the communication network(s) 110 optionally includes the Internet, one or more local area networks (LANs), one or more wide area networks (WANs), other types of networks, or a combination of such networks.

FIG. 2 is a block diagram illustrating a client device 102 (“client 102,” also called a client), in accordance with some implementations. The client device 102, in some implementations, includes one or more processing units CPU(s) 201 (also herein referred to as processors), a user interface 202 (comprising a keyboard and/or mouse 203, touchpad, touchscreen or other input device), one or more network interfaces 204, memory 206, media content input/output 230, TV remote control 232, and one or more communication buses 205 for interconnecting these components. The communication buses 205 optionally include circuitry (sometimes called a chipset) that interconnects and controls communications between system components. The memory 206 typically includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other random access solid state memory devices; and optionally includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. The memory 206 optionally includes one or more storage devices remotely located from the CPU(s) 201. The memory 206, or alternatively the non-volatile memory device(s) within the memory 206, comprises a non-transitory computer readable storage medium. In some implementations, the memory 206 or alternatively the non-transitory computer readable storage medium stores the following programs, modules and data structures, or a subset thereof:

-   an operating system 208, which includes procedures for handling various basic system services and for performing hardware dependent tasks;
-   a network communication module (or instructions) 210 for connecting the client 102 with other devices (e.g., the broadcast system 104 or the server 106) via one or more network interfaces 204 (wired or wireless), and the one or more communication networks 110 (FIG. 1); and
-   client data 136 stored on the client 102, which include:
    -   media content 212-i (media content 212-1 through media content 212-n) for including media content and/or corresponding meta-data, received from the broadcast system 104 or the content feeds 108, for a user associated with the client 102 to consume, including:
        -   audio/video content 214-i for including audio and/or video media content, or portion(s) thereof, corresponding to the media content 212-i;
        -   meta-data 216-i for including meta-data (e.g., title, length, timestamps, or access history) associated with the audio/video content 214-i; and
        -   optionally, content fingerprint 218-i for identifying audio/video content 214-i, or portion(s) thereof.

In some implementations, the client 102 also includes a display 114, a media content input/output 230, and a TV remote control 232. In some implementations, the display 114 includes a TV screen or a computer monitor. In some implementations, the media content input/output 230 includes a line-in/out port, and the TV remote control 232 includes a TV remote control device for controlling the client 102. In other implementations, the client 102 is a phone and includes a second screen application, which captures voice samples of the media content 134.

In some implementations, one or more of the above identified elements are stored in one or more of the previously mentioned memory devices, and correspond to a set of instructions for performing a function described above. The above identified modules or programs (e.g., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory 206 optionally stores a subset of the modules and data structures identified above. Furthermore, the memory 206 may store additional modules and data structures not described above.

FIG. 3 is a block diagram illustrating the server system 106 (“server 106,” also called a server), in accordance with some implementations. The server 106 typically includes one or more processing units CPU(s) 302 (also herein referred to as processors), one or more network or other communications interfaces 304, the memory 306, and one or more communication buses 305 for interconnecting these components. The communication buses 305 optionally include circuitry (sometimes called a chipset) that interconnects and controls communications between system components. The memory 306 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other random access solid state memory devices; and optionally includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. The memory 306 optionally includes one or more storage devices remotely located from CPU(s) 302. The memory 306, or alternatively the non-volatile memory device(s) within the memory 306, comprises a non-transitory computer readable storage medium. In some implementations, the memory 306 or alternatively the non-transitory computer readable storage medium stores the following programs, modules and data structures, or a subset thereof:

-   an operating system 308, which includes procedures for handling various basic system services and for performing hardware dependent tasks;
-   a network communication module (or instructions) 310 for connecting the server 106 with other devices (e.g., the client 102 or the content feeds 108) via the one or more network interfaces 304 (wired or wireless), and the communication network(s) 110 (FIG. 1);
-   a front end server 116 for receiving media content 134 and/or corresponding content fingerprint 132, from the client 102 and from the content feeds 108, and transmitting comparison data 138 to other devices for further actions thereon (e.g., to a server that generates advertising materials based on comparison data 138);
-   a comparison module 118 for comparing media content or portion(s) thereof received from the client 102 with that received from the content feeds 108, which includes:
    -   a fingerprint matching module 312 for matching fingerprint(s) for media content received from the client 102 with fingerprint(s) for media content received from the content feeds 108;
-   a content database 120 for storing, on the server 106, media content received from the client 102 or from the content feeds 108, or portion(s) thereof, including:
    -   media content 212-i (media content 212-1 through media content 212-n) for including media content and related data received from the broadcast system 104 or from the content feeds 108, for a user associated with the client 102 to consume (e.g., watch or listen to), including:
        -   audio/video content 214-i for including audio or video media content, or portion(s) thereof, corresponding to the media content 212-i; and
        -   meta-data 216-i for including meta-data (e.g., title, length, timestamps, or access history) associated with the audio/video content 214-i; and
-   a fingerprint database 122 for storing, on the server 106, fingerprints received from the client 102 or from the content feeds 108, or fingerprints generated by the fingerprint generation module 124, which includes:
    -   content fingerprint 218-i (content fingerprint 218-1 through content fingerprint 218-n) for including content fingerprint corresponding to the media content 212-i, including:
        -   an identifier 320-i for uniquely identifying a content fingerprint;
        -   audio information 322-i for including a fingerprint or other compressed representation of a predefined portion (e.g., a predefined number of seconds or minutes) of an audio stream or the audio portion of a video stream;
        -   video information 324-i for including a fingerprint or other compressed representation of a predefined portion (e.g., a predefined number of frames, seconds or minutes) of a video stream; and
        -   a list of associated files 326-i for including one or more files that include the media content associated with the content fingerprint 218-i, or pointers to the one or more files;
-   optionally, a fingerprint generation module 124 for generating one or more fingerprints using media content received from the client 102 or that received from the content feeds 108; and
-   comparison data 138 for including result(s) for comparing fingerprint(s) corresponding to media content received from the client 102 with fingerprint(s) corresponding to media content received from the content feeds 108.
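As an illustration only, the content fingerprint entries described for the fingerprint database 122 could be modeled as a simple record type. The following Python sketch assumes byte-string fingerprints and an in-memory dictionary keyed by identifier; the class and field names are hypothetical and merely echo the reference numerals in the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContentFingerprint:
    """Illustrative record for a content fingerprint 218-i."""

    identifier: str                            # cf. identifier 320-i
    audio_information: Optional[bytes] = None  # cf. audio information 322-i
    video_information: Optional[bytes] = None  # cf. video information 324-i
    # cf. list of associated files 326-i (file names or pointers to files)
    associated_files: List[str] = field(default_factory=list)


class FingerprintDatabase:
    """Toy in-memory stand-in for the fingerprint database 122."""

    def __init__(self) -> None:
        self._records: Dict[str, ContentFingerprint] = {}

    def store(self, record: ContentFingerprint) -> None:
        # A later record with the same identifier replaces the earlier one.
        self._records[record.identifier] = record

    def lookup(self, identifier: str) -> Optional[ContentFingerprint]:
        return self._records.get(identifier)
```

A persistent store would likely replace the dictionary in practice; the sketch only shows how the listed fields relate to one another.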

In some implementations, one or more of the above identified elements are stored in one or more of the previously mentioned memory devices, and correspond to a set of instructions for performing a function described above. The above identified modules or programs (e.g., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory 306 optionally stores a subset of the modules and data structures identified above. Furthermore, the memory 306 optionally stores additional modules and data structures not described above.

Although FIG. 3 shows a “server system 106,” also referred to as a server, FIG. 3 is intended more as a functional description of the various features which may be present in a backend system than as a structural schematic of the implementations described herein. In practice, and as recognized by those of ordinary skill in the art, items shown separately could be combined and some items could be separated.

FIG. 4 is a block diagram illustrating the one or more content feeds 108 (“content feeds 108”), in accordance with some implementations. The content feeds 108 typically include one or more processing units CPU(s) 402 (also herein referred to as processors), one or more network or other communications interfaces 404, memory 406, and one or more communication buses 405 for interconnecting these components. The communication buses 405 optionally include circuitry (sometimes called a chipset) that interconnects and controls communications between system components. The memory 406 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other random access solid state memory devices; and optionally includes non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. The memory 406 optionally includes one or more storage devices remotely located from CPU(s) 402. The memory 406, or alternatively the non-volatile memory device(s) within the memory 406, comprises a non-transitory computer readable storage medium. In some implementations, the memory 406, or alternatively the non-transitory computer readable storage medium stores the following programs, modules and data structures, or a subset thereof:

-   an operating system 408, which includes procedures for handling various basic system services and for performing hardware dependent tasks;
-   a network communication module (or instructions) 410 for connecting content feeds 108 with other devices (e.g., the client 102, the broadcast system 104 or the server 106) via the one or more network interfaces 404 (wired or wireless), and the communication network(s) 110 (FIG. 1);
-   video content feed(s) 126 for providing video media content, including media content being displayed on the client 102, including:
    -   video content 412 for including video media content (e.g., a TV episode, a movie, a YouTube stream) received from the broadcast system 104 or from the content feeds 108;
    -   meta-data 414 for including meta-data associated with the video content 412; and
    -   optionally, content fingerprint 416 for including fingerprint(s) corresponding to the video content 412 or portion(s) thereof;
-   audio content feed(s) 128 for providing audio media content or audio portion of video media content, including:
    -   audio content 418 for including audio media content (e.g., a song, an audio book, or an audio recording) received from the broadcast system 104 or from the content feeds 108;
    -   meta-data 420 for including meta-data associated with the audio content 418; and
    -   optionally, content fingerprint 422 for including fingerprint(s) corresponding to the audio content 418, or portion(s) thereof; and
-   other content feed(s) 130 for providing other media content, including media content being displayed on the client 102, including:
    -   other content 424 for including other media content received from the broadcast system 104 or from the content feeds 108;
    -   meta-data 426 for including meta-data associated with the other content 424; and
    -   optionally, content fingerprint 428 for including fingerprint(s) corresponding to the other content 424, or portion(s) thereof.

In some implementations, one or more of the above identified elements are stored in one or more of the previously mentioned memory devices, and correspond to a set of instructions for performing a function described above. The above identified modules or programs (e.g., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, the memory 406 optionally stores a subset of the modules and data structures identified above. Furthermore, the memory 406 optionally stores additional modules and data structures not described above.

Although FIG. 4 shows one “content feeds 108,” FIG. 4 is intended more as a functional description of the various features which may be present in a backend system than as a structural schematic of the implementations described herein. In practice, and as recognized by those of ordinary skill in the art, items shown separately could be combined and some items could be separated.

FIG. 5 includes a flowchart illustrating a method 500 for matching live media content in a client-server system, in accordance with some implementations. In some implementations, the method 500 is performed by one or more modules on the client 102.

In some implementations, the client 102 sends (502) first media content to the server 106. In some implementations, the content feeds 108 also send (504) second media content to the server 106. In some implementations, the first media content and the second media content are sent to the server 106 at or around the same time. In some implementations, the first media content and the second media content are sent to the server 106 within a predefined time interval (e.g., within 5 milliseconds) of each other.

In some implementations, the first media content is media content, or a portion thereof, that the client 102 received from a content provider (e.g., YouTube), and the second media content is media content, or a portion thereof, that the content feeds 108 received from the same or another content provider and that matches the first media content.

In some implementations, at or around substantially the same time (e.g., live), the server 106 obtains (506, 508) the first and second media content. In some implementations, to account for any potential delays, the first and second media content are obtained within a predefined time interval (e.g., 1 millisecond) of each other. In some implementations, the potential delays include delays (e.g., delays due to network performance) between the client 102 and the server 106, and delays between the content feeds 108 and the server 106.

In some implementations, after the first and second media content are obtained, the server 106, optionally, compares (510) the first and second media content to determine whether media content received from the client 102 (e.g., the first media content) matches media content (e.g., the second media content) received from the content feeds 108, and, if so, uses meta-data (e.g., title information) associated with the second media content to identify what media content (e.g., the first media content) is being played on the client 102.

In other implementations, the comparison between the first and second media content occurs on other devices, such as on the client 102 or on other server(s) 106, and the result thereof is transmitted afterwards to the server 106. In some embodiments, comparing the first and second media content includes comparing the fingerprint of the first media content with the fingerprint of the second media content. In some embodiments, media content fingerprints are generated by the server 106 (e.g., by the fingerprint generation module 124). In other embodiments, media content fingerprints are generated by the client 102 (e.g., by the media content processing module 112) and/or by the content feeds 108, and then transmitted to the server 106 for comparison.
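The text does not specify how two fingerprints are compared. Purely as an illustrative assumption, the Python sketch below treats fingerprints as equal-length byte strings and declares a match when the fraction of identical bits reaches a threshold; the function names and the 0.9 threshold are hypothetical, and a real fingerprint matching module may work quite differently.

```python
def bit_similarity(a: bytes, b: bytes) -> float:
    """Fraction of bit positions at which two equal-length fingerprints agree."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    if not a:
        return 1.0  # two empty fingerprints are trivially identical
    # XOR exposes differing bits; count them byte by byte.
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 1.0 - differing / (8 * len(a))


def fingerprints_match(a: bytes, b: bytes, threshold: float = 0.9) -> bool:
    """Assumed matching rule: similarity at or above the threshold."""
    return bit_similarity(a, b) >= threshold
```

A tolerance-based rule of this kind is one way to absorb small differences between a client capture and a feed capture of the same portion; exact equality would be the strictest special case.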

In some implementations, after the first and second media content are compared, the server 106 optionally determines (512) whether the second media content corresponds to a portion of the media content that has been played on the client. For example, in some implementations, the first media content includes a portion of the media content being played on the client 102 at a predefined time, and the second media content includes another portion of the media content being played on the client 102. In some implementations, both the first and second media content are portions of the same video content (e.g., a same TV episode or a same YouTube stream) or the same audio content (e.g., a same MP3 song or a same audio stream) being played on the client 102. In some implementations, therefore, based on the comparison between the first and second media content, the server 106 determines whether the second media content (received from the content feeds) is (1) an earlier portion to the first media content (e.g., the second media content has been played on the client 102); (2) a subsequent portion to the first media content (e.g., the second media content has not yet been played on the client 102); or (3) concurrent with (e.g., substantially the same as) the first media content.
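The three-way determination (512) can be sketched in Python under a simplifying assumption that is not stated in the text: each portion is taken to carry a playback offset in seconds (for example, derived from timestamp meta-data), and offsets within a small tolerance count as concurrent. The names `Relation` and `classify_second_portion` and the 0.5 second tolerance are hypothetical.

```python
from enum import Enum


class Relation(Enum):
    EARLIER = "earlier"        # second portion has already been played
    SUBSEQUENT = "subsequent"  # second portion has not yet been played
    CONCURRENT = "concurrent"  # substantially the same portion


def classify_second_portion(
    first_offset: float, second_offset: float, tolerance: float = 0.5
) -> Relation:
    """Classify the feed portion (second) relative to the client portion (first)."""
    delta = second_offset - first_offset
    if abs(delta) <= tolerance:
        return Relation.CONCURRENT
    return Relation.EARLIER if delta < 0 else Relation.SUBSEQUENT
```

For instance, a client portion starting at 10 seconds and a feed portion starting at 0 seconds would be classified as `Relation.EARLIER`, which corresponds to the case in which the content feeds lag behind the client.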

In some implementations, if the second media content corresponds to a portion of the media content that has already been played on the client 102 (“Yes,” 514), the content feeds 108 send (516) third media content to the server 106. In some implementations, the third media content is media content, or a portion thereof, the server 106 received from the content provider that provides the first and second media content.

In some implementations, the first media content is associated with a predefined expiration time (e.g., as specified in a timestamp associated with the first media content). In some implementations, before the predefined expiration time, the server 106 obtains (518) the third media content from the content feeds 108. In some implementations, the predefined expiration time is sufficient to ensure that the second and third media content belong to the same media content (for example, two portions of a same movie), because, after an extended period of time, the first media content might no longer accurately reflect what is being displayed on the client 102; for example, a user might have switched from one TV channel to another after the first media content is obtained by the server 106.

In some implementations, the server 106 further compares (520) the first media content with the third media content, e.g., to determine whether the third media content is a subsequent or an earlier portion to, or concurrent with, the first media content. In some implementations, if the third media content is not received before the predefined expiration time, the server 106 returns a predefined comparison result (e.g., a mismatch).

Alternatively, in some implementations, if the second media content corresponds to a portion of the media content that has not yet been played on the client 102 (“No,” 522), the client 102 sends (524) fourth media content to the server 106. In some implementations, after a predefined time period (e.g., a predefined time interval after the determination (512)), the server 106 obtains (524) the fourth media content from the client 102. In some implementations, the predefined time interval is sufficient for the media content provided by the client 102 to catch up with the second media content. In some implementations, the predefined time interval is also sufficient to ensure that the first and fourth media content belong to the same media content (for example, two portions of a same movie), because a user might have switched to different media content (e.g., switching from one TV channel to another) after an extended period of time.

In some implementations, the server 106 then compares (528) the second media content with the fourth media content, e.g., to determine whether the fourth media content is a subsequent or an earlier portion relative to, or concurrent with, the second media content.

In some implementations, when there is a match (e.g., between the first and third media content, or between the second and fourth media content), the server 106 retrieves meta-data associated with media content received from the content feeds 108 (e.g., the third or second media content), and associates the respective meta-data with the media content received from the client 102 (the first or fourth media content), e.g., to identify what media content a user associated with the client 102 is consuming.

FIG. 6 includes a flowchart 600 illustrating a method for matching live media content at a server system, in accordance with some implementations. In some implementations, the method 600 is performed by one or more modules on the server 106.

In some implementations, the server 106 obtains (602) first media content from the client 102. In some implementations, the first media content corresponds (604) to a portion (a predefined number of frames or a predefined length) of media content being played on the client 102. In some implementations, the first media content corresponds (604) to (e.g., includes or is) a portion of the media content being played on the client 102, for example, the first 10 seconds of an MP3 song or the first 1000 frames of a TV episode being played on the client 102. In some implementations, the first media content is also associated with a predefined expiration time, for example, as specified in a timestamp associated with the first media content. In some implementations, the predefined expiration time specifies a time frame (e.g., 5, 10 or 15 milliseconds) during which the first media content is valid for comparison (e.g., with the second or third media content). In some implementations, after the passage of the time frame, as specified in the predefined expiration time, the server 106 returns a predefined comparison result (e.g., a mismatch or an error message).
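The expiration behavior can be sketched as follows. This is a minimal illustration only: the `ClientSample` type, its field names, and the string results "match" and "mismatch" are assumptions chosen for the example, and fingerprints are compared by simple equality rather than by any particular matching technique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientSample:
    """Hypothetical record for the first media content."""
    fingerprint: str
    obtained_at: float  # timestamp in seconds
    valid_for: float    # length of the validity time frame in seconds


def compare_before_expiry(sample, candidate_fingerprint, now):
    """Return a predefined result once the sample's time frame has passed."""
    if now - sample.obtained_at > sample.valid_for:
        return "mismatch"  # predefined comparison result after expiration
    if sample.fingerprint == candidate_fingerprint:
        return "match"
    return "mismatch"


# Example: a sample that is valid for 15 milliseconds.
sample = ClientSample(fingerprint="abc", obtained_at=100.0, valid_for=0.015)
within_window = compare_before_expiry(sample, "abc", now=100.010)
after_window = compare_before_expiry(sample, "abc", now=100.020)
```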

In some implementations, the server 106 also obtains (606) second media content from one or more content feeds 108. In some implementations, the second media content also corresponds (608) to (e.g., includes or is) a portion of the media content being played on the client 102. In some implementations, the first and second media content are two portions of the media content being played on the client device (e.g., two different portions of the MP3 song or the TV episode). In some implementations, the first media content is a subsequent portion of the media content relative to the second media content, for example, the second 10 seconds of an MP3 song or the second 1000 frames of a TV episode being played on the client 102.

In some implementations, the first media content and the second media content are obtained at a same or substantially same predefined time, or within a predefined time interval of each other (e.g., to account for differential network delays between the client 102 and the server 106 and between the content feeds 108 and the server 106). In some implementations, the server 106 determines whether the second media content corresponds to a portion of the media content that has been played on the client device (e.g., the portion represented by the first media content). For example, because, in some implementations, the first media content is a portion of media content being displayed on the client 102, the above-mentioned determination is made by comparing the first and the second media content, or by comparing one or more respective fingerprints corresponding to the first and the second media content.

In some implementations, determining whether the second media content corresponds to a portion of the media content that has been played on the client device includes (1) obtaining a first content fingerprint using the first media content, (2) obtaining a second content fingerprint using the second media content, and (3) comparing the first content fingerprint with the second content fingerprint. In some implementations, a content fingerprint (e.g., content fingerprint 218-1, FIG. 3) includes information that identifies a portion of media content, for example, a predefined number of frames (e.g., video information 324-1, FIG. 3) in video media content, or a predefined audio clip (e.g., audio information 312-1, FIG. 3) in audio media content. In some implementations, a content fingerprint includes at least one of: average zero cross rate, estimate tempo, average spectrum, spectral flatness, prominent tones cross a set of bands, and bandwidth.
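Two of the listed fingerprint features can be sketched for a short audio clip. The sketch below is an illustration under stated assumptions, not the fingerprinting used by the server 106: it computes a zero-crossing rate and a spectral flatness (geometric mean over arithmetic mean of the power spectrum) with a naive discrete Fourier transform, and compares fingerprints with an assumed per-feature tolerance. All function names and the `tolerance` and `eps` parameters are hypothetical.

```python
import cmath
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(samples) - 1)


def power_spectrum(samples):
    """Naive DFT power for bins 1..n/2 (DC bin skipped); O(n^2), short clips only."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))) ** 2
        for k in range(1, n // 2 + 1)
    ]


def spectral_flatness(samples, eps=1e-12):
    """Geometric mean divided by arithmetic mean of the power spectrum."""
    power = [p + eps for p in power_spectrum(samples)]  # eps avoids log(0)
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


def content_fingerprint(samples):
    """Hypothetical two-feature fingerprint for one audio clip."""
    return (zero_crossing_rate(samples), spectral_flatness(samples))


def fingerprints_match(first, second, tolerance=0.05):
    """Treat fingerprints as matching when every feature is within tolerance."""
    return all(abs(a - b) <= tolerance for a, b in zip(first, second))


# A pure tone has low flatness; a single impulse has a flat spectrum.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
```

A production fingerprint would typically combine more features and be robust to noise and encoding differences; the tolerance-based comparison here is only one simple choice.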

In some implementations, the media content provided by the content feeds 108 lags behind the media content provided by the client 102 (e.g., the client 102 is ahead of the content feeds 108). Therefore, in some implementations, in accordance with a determination (610) that the second media content corresponds to a portion of the media content that has been played on the client 102, the server 106 obtains (612) third media content (e.g., a subsequent portion relative to the second media content), which also corresponds to the media content being played on the client device, from the one or more content feeds 108.

In some implementations, the server 106 obtains the third media content after a predefined amount of time, so that media content received from the content feeds 108 can catch up with the media content received from the client 102. For example, if the first media content received from the client 102 is the second 5 seconds of a TV episode, and the second media content received from the content feeds 108 is the first 5 seconds of the same TV episode, the server 106 waits for 5 seconds before obtaining the third media content, which, in many situations, would likely be the second 5 seconds of the TV episode, because after the 5-second delay, subsequent media content (e.g., the third media content) received from the content feeds 108 has likely caught up with the media content (e.g., the first media content) received from the client 102. In some implementations, the third media content is obtained before the predefined expiration time (e.g., as specified in a timestamp) to ensure that the second and third media content are portions of the same media content (e.g., a user has not switched channels). In some implementations, when the third media content is not obtained before the predefined expiration time, the server 106 returns a predefined comparison result, such as a mismatch or an error.
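The catch-up behavior in the 5-second example can be sketched as follows. The function name `match_after_catch_up`, the `SimulatedFeed` helper, the segment labels, and the use of label equality in place of fingerprint comparison are all assumptions made for illustration; the clock and wait functions are injectable so that the example runs without real delays.

```python
import time


def match_after_catch_up(first_fp, fetch_feed_fp, lag_s, expires_at,
                         now=time.monotonic, wait=time.sleep):
    """Wait for a lagging feed, then compare its next portion with the client's.

    Returns the predefined result "mismatch" if waiting would pass expires_at.
    """
    if now() + lag_s >= expires_at:
        return "mismatch"
    wait(lag_s)                  # let the content feed catch up
    third_fp = fetch_feed_fp()   # third media content from the feed
    return "match" if third_fp == first_fp else "mismatch"


class SimulatedFeed:
    """Hypothetical feed that advances one segment per segment_s simulated seconds."""

    def __init__(self, segments, segment_s=5.0):
        self.segments = segments
        self.segment_s = segment_s
        self.clock = 0.0

    def now(self):
        return self.clock

    def wait(self, seconds):
        self.clock += seconds

    def current(self):
        index = min(int(self.clock // self.segment_s), len(self.segments) - 1)
        return self.segments[index]


# The client is on the second 5 seconds; the feed is still on the first 5 seconds.
feed = SimulatedFeed(["seg-1", "seg-2", "seg-3"])
result = match_after_catch_up("seg-2", feed.current, lag_s=5.0, expires_at=30.0,
                              now=feed.now, wait=feed.wait)

# With an expiration time shorter than the lag, the predefined result is returned.
late_feed = SimulatedFeed(["seg-1", "seg-2", "seg-3"])
expired = match_after_catch_up("seg-2", late_feed.current, lag_s=5.0, expires_at=3.0,
                               now=late_feed.now, wait=late_feed.wait)
```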

In some implementations, after obtaining the third media content, the server 106 compares (614) the first media content with the third media content. In some implementations, comparing the first media content with the third media content includes obtaining and comparing respective content fingerprint(s) for the first and third media content. In some implementations, the second and third media content are two different portions of the same media content (e.g., two different portions of a same TV episode), and the third media content is a portion subsequent to the second media content. For example, the second media content is the first 5 minutes of a TV episode and the third media content is the second 5 minutes of the same TV episode. In some implementations, the server 106 returns a predefined matching result (e.g., a match) when the first and third media content substantially match each other. This approach is beneficial because it allows matching media content received from the client 102 with media content received from the content feeds 108, even when the media content from the content feeds 108 lags behind that from the client 102 (e.g., the content feeds 108 are slower than the client 102).

Alternatively, in some implementations, the media content provided by the client 102 lags behind that provided by the content feeds 108 (e.g., the content feeds 108 are ahead of the client 102). Therefore, in some implementations, in accordance with a determination (616) that the second media content corresponds to a portion of the media content that has not yet been played on the client device, the server 106 obtains (618) fourth media content from the client 102. In some implementations, the server 106 obtains the fourth media content after a predefined amount of time (e.g., the predefined time interval), so that media content received from the client 102 can catch up with the media content received from the content feeds 108. For example, if the first media content received from the client 102 is the first 5 seconds of a TV episode, and the second media content received from the content feeds 108 is the second 5 seconds of the same TV episode, the server 106 waits for 5 seconds or more before obtaining from the client 102 the fourth media content, which, in many situations, would likely be the second 5 seconds of the TV episode, because after the 5-second delay, subsequent media content received from the client 102 (e.g., the fourth media content) has likely caught up with the media content received from the content feeds 108 (e.g., the second media content). In some implementations, when the fourth media content is not obtained before a predefined expiration time, the server 106 returns a predefined comparison result, such as a mismatch or an error.

In some implementations, after obtaining the fourth media content, the server 106 compares (614) the second media content with the fourth media content. In some implementations, comparing the second media content with the fourth media content includes obtaining and comparing respective content fingerprint(s) for the second and fourth media content. In some implementations, the first and fourth media content are two different portions of the same media content, and the fourth media content is a portion subsequent to the first media content. For example, the first media content is the first 5 seconds of a TV episode and the fourth media content is the second 5 seconds of the same TV episode. In some implementations, the server 106 returns a predefined matching result (e.g., a match) when the second and fourth media content substantially match each other. This approach is also beneficial because it allows matching media content received from the client 102 with media content received from the content feeds 108 when the media content from the client 102 lags behind that received from the content feeds 108 (e.g., the content feeds 108 are faster than the client 102).

In some implementations, the media content provided by the content feeds 108 is concurrent with that provided by the client 102, for example, when there is no significant lag between the client 102 and the content feeds 108. In some implementations, the server 106 returns a predefined matching result (e.g., a match) when the first and second media content substantially match each other.

In some implementations, after one or more comparisons of the media content (the first with the third, the second with the fourth, or the first with the second) are completed, a matching result is received. If there is a match, the server 106, in some implementations, uses the meta-data associated with the third or second media content received from the content feeds 108 to determine the identity (e.g., title information) of the first or fourth media content, respectively.
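The association of feed-side meta-data with client-side media content can be sketched as a small lookup. The comparison keys, the labels "first" through "fourth", and the dictionary layout of the meta-data are assumptions made for this illustration only.

```python
# For each comparison: (client-side item being identified, feed-side item supplying meta-data).
COMPARISON_PAIRS = {
    "first_vs_third": ("first", "third"),
    "second_vs_fourth": ("fourth", "second"),
    "first_vs_second": ("first", "second"),
}


def identify_client_item(comparison, matched, feed_metadata):
    """Return (client item label, meta-data) on a match, otherwise None."""
    if not matched:
        return None
    client_label, feed_label = COMPARISON_PAIRS[comparison]
    return client_label, feed_metadata[feed_label]


# Hypothetical meta-data received with the feed-side media content.
feed_metadata = {
    "second": {"title": "Example Episode"},
    "third": {"title": "Example Episode"},
}
identified = identify_client_item("second_vs_fourth", True, feed_metadata)
unidentified = identify_client_item("first_vs_third", False, feed_metadata)
```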

In some implementations, after the identity (e.g., title or author information) of the media content being displayed on the client 102 is determined, further actions, such as sending related advertising materials, or recommending related media content (e.g., another song by the same recording artist), are taken.

The methods illustrated in FIGS. 5 and 6 may be governed by instructions that are stored in a computer readable storage medium and that are executed by at least one processor of at least one host device (or at least one server). Each of the operations shown in FIGS. 5 and 6 may correspond to instructions stored in a non-transitory computer memory or computer readable storage medium. In various implementations, the non-transitory computer readable storage medium includes a magnetic or optical disk storage device, solid state storage devices such as Flash memory, or other non-volatile memory device or devices. The computer readable instructions stored on the non-transitory computer readable storage medium may be in source code, assembly language code, object code, or other instruction format that is interpreted and/or executable by one or more processors.

Plural instances may be provided for components, operations or structures described herein as a single instance. Finally, boundaries between various components, operations, and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the implementation(s). In general, structures and functionality presented as separate components in the example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the implementation(s).

It will also be understood that, although the terms “first,” “second,” etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first media content could be termed a second media content, and, similarly, a second media content could be termed a first media content, without changing the meaning of the description, so long as all occurrences of the “first media content” are renamed consistently and all occurrences of the “second media content” are renamed consistently. The first media content and the second media content are both media content, but they are not the same media content.

The terminology used herein is for the purpose of describing particular implementations only and is not intended to be limiting of the claims. As used in the description of the implementations and the appended claims, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting,” that a stated condition precedent is true, depending on the context. Similarly, the phrase “if it is determined (that a stated condition precedent is true)” or “if (a stated condition precedent is true)” or “when (a stated condition precedent is true)” may be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” that the stated condition precedent is true, depending on the context.

The foregoing description included example systems, methods, techniques, instruction sequences, and computing machine program products that embody illustrative implementations. For purposes of explanation, numerous specific details were set forth in order to provide an understanding of various implementations of the inventive subject matter. It will be evident, however, to those skilled in the art that implementations of the inventive subject matter may be practiced without these specific details. In general, well-known instruction instances, protocols, structures and techniques have not been shown in detail.

The foregoing description, for purpose of explanation, has been described with reference to specific implementations. However, the illustrative discussions above are not intended to be exhaustive or to limit the implementations to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The implementations were chosen and described in order to best explain the principles and their practical applications, to thereby enable others skilled in the art to best utilize the implementations and various implementations with various modifications as are suited to the particular use contemplated.

What is claimed is:
1. A method comprising: at a server having one or more processors and memory storing one or more programs for execution by the one or more processors: obtaining a first media content item from a client device, wherein the first media content item corresponds to a first portion of media content being played on the client device; obtaining a second media content item from a content source distinct from the server; comparing the first media content item and the second media content item; in accordance with a determination, based on the comparing of the first and second media content items, that the second media content item from the content source corresponds to a portion of the media content that is earlier than the first media content item from the client device: obtaining a third media content item from the content source, wherein the third media content item corresponds to a third portion of the media content subsequent to the second media content item; comparing the first media content item with the third media content item; and in accordance with a determination, based on the comparing of the first and third media content items, that the first media content item from the client device is concurrent with the third media content item from the content source: identifying the first media content item from the client device using identification information corresponding to the third media content item from the content source.
2. The method of claim 1, further comprising: in accordance with a determination, based on the comparing of the first and second media content items, that the second media content item from the content source corresponds to a portion of the media content that is subsequent to the first media content item from the client device: obtaining, from the client device, a fourth media content item corresponding to a fourth portion of the media content subsequent to the first media content item; and comparing the second media content item with the fourth media content item; and in accordance with a determination, based on the comparing of the second and fourth media content items, that the second media content item is concurrent with the fourth media content item from the client device: identifying the fourth media content item from the client device using identification information corresponding to the second media content item from the content source.
3. The method of claim 1, further comprising: in accordance with a determination, based on the comparing of the first and second media content items, that the second media content item from the content source corresponds to a portion of the media content that is concurrent with the first media content item from the client device: identifying the first content item from the client device using identification information corresponding to the second media content item from the content source.
4. The method of claim 1, wherein the first media content item and the second media content item are obtained at a same predefined time.
5. The method of claim 1, further comprising: obtaining a first content fingerprint using the first media content item; and obtaining a second content fingerprint using the third media content item.
6. The method of claim 5, wherein comparing the first media content item with the third media content item comprises: comparing the first content fingerprint with the second content fingerprint.
7. The method of claim 5, wherein the first content fingerprint includes at least one of: average zero cross rate, estimate tempo, average spectrum, spectral flatness, prominent tones cross a set of bands, and bandwidth.
8. The method of claim 1, wherein the media content is one of: a clip of a video or audio stream, or a predefined number of frames of a video stream.
9. A server system, comprising: one or more processors; memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for: obtaining a first media content item from a client device, wherein the first media content item corresponds to a first portion of media content being played on the client device; obtaining a second media content item from a content source distinct from the server; comparing the first media content item and the second media content item; in accordance with a determination, based on the comparing of the first and second media content items, that the second media content item from the content source corresponds to a portion of the media content that is earlier than the first media content item from the client device: obtaining a third media content item from the content source, wherein the third media content item corresponds to a third portion of the media content subsequent to the second media content item; comparing the first media content item with the third media content item; and in accordance with a determination, based on the comparing of the first and third media content items, that the first media content item from the client device is concurrent with the third media content item from the content source: identifying the first media content item from the client device using identification information corresponding to the third media content item from the content source.
10. The server system of claim 9, wherein the one or more programs comprise instructions for: in accordance with a determination, based on the comparing of the first and second media content items, that the second media content item from the content source corresponds to a portion of the media content that is subsequent to the first media content item from the client device: obtaining, from the client device, a fourth media content item corresponding to a fourth portion of the media content subsequent to the first media content item; and comparing the second media content item with the fourth media content item; and in accordance with a determination, based on the comparing of the second and fourth media content items, that the second media content item is concurrent with the fourth media content item from the client device: identifying the fourth media content item from the client device using identification information corresponding to the second media content item from the content source.
11. The server system of claim 9, wherein the one or more programs comprise instructions for: in accordance with a determination, based on the comparing of the first and second media content items, that the second media content item from the content source corresponds to a portion of the media content that is concurrent with the first media content item from the client device: identifying the first content item from the client device using identification information corresponding to the second media content item from the content source.
12. The server system of claim 9, wherein the first media content item and the second media content item are obtained at a same predefined time.
13. The server system of claim 9, wherein the one or more programs comprise instructions for: obtaining a first content fingerprint using the first media content; and obtaining a second content fingerprint using the third media content.
14. The server system of claim 13, wherein comparing the first media content item with the third media content item comprises: comparing the first content fingerprint with the second content fingerprint.
15. A non-transitory computer readable storage medium storing one or more programs configured for execution by a computer system, the one or more programs including instructions for: obtaining a first media content item from a client device, wherein the first media content item corresponds to a first portion of media content being played on the client device; obtaining a second media content item from a content source distinct from the computer system; comparing the first media content item and the second media content item; in accordance with a determination, based on the comparing of the first and second media content items, that the second media content item from the content source corresponds to a portion of the media content that is earlier than the first media content item from the client device: obtaining a third media content item from the content source, wherein the third media content item corresponds to a third portion of the media content subsequent to the second media content item; comparing the first media content item with the third media content item; and in accordance with a determination, based on the comparing of the first and third media content items, that the first media content item from the client device is concurrent with the third media content item from the content source: identifying the first media content item from the client device using identification information corresponding to the third media content item from the content source.
16. The computer readable storage medium of claim 15, wherein the one or more programs further include instructions for: in accordance with a determination, based on the comparing of the first and second media content items, that the second media content item from the content source corresponds to a portion of the media content that is subsequent to the first media content item from the client device: obtaining, from the client device, a fourth media content item corresponding to a fourth portion of the media content subsequent to the first media content item; and comparing the second media content item with the fourth media content item; and in accordance with a determination, based on the comparing of the second and fourth media content items, that the second media content item is concurrent with the fourth media content item from the client device: identifying the fourth media content item from the client device using identification information corresponding to the second media content item from the content source.
17. The computer readable storage medium of claim 15, wherein the one or more programs further include instructions for: in accordance with a determination, based on the comparing of the first and second media content items, that the second media content item from the content source corresponds to a portion of the media content that is concurrent with the first media content item from the client device: identifying the first content item from the client device using identification information corresponding to the second media content item from the content source.
18. The computer readable storage medium of claim 15, wherein the first media content item and the second media content item are obtained at a same predefined time.
19. The computer readable storage medium of claim 15, wherein the one or more programs further include instructions for: obtaining a first content fingerprint using the first media content item; and obtaining a second content fingerprint using the third media content item.
20. The computer readable storage medium of claim 19, wherein comparing the first media content item with the third media content item comprises: comparing the first content fingerprint with the second content fingerprint.